Overview

Brought to you by YData

Dataset statistics

Number of variables: 22
Number of observations: 1852394
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 310.9 MiB
Average record size in memory: 176.0 B
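
The average record size follows from the two figures reported with it: total size in memory divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 MiB is 2**20 bytes; the function name is illustrative and not part of ydata-profiling.

```python
def average_record_size(total_mib: float, n_rows: int) -> float:
    """Return bytes per row, treating 1 MiB as 2**20 bytes."""
    return total_mib * 2**20 / n_rows


# Values copied from the dataset statistics in this report.
size = average_record_size(310.9, 1852394)
print(f"{size:.1f} B")
```

The result agrees with the reported 176.0 B to one decimal place.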

Variable types

DateTime: 2
Numeric: 9
Text: 8
Categorical: 3

Alerts

High correlation: lat is highly overall correlated with merch_lat
High correlation: long is highly overall correlated with merch_long and 1 other field
High correlation: merch_lat is highly overall correlated with lat
High correlation: merch_long is highly overall correlated with long and 1 other field
High correlation: zip is highly overall correlated with long and 1 other field
Imbalance: is_fraud is highly imbalanced (95.3%)
Skewed: amt is highly skewed (γ1 = 40.81280918)
Unique: trans_num has unique values
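
The imbalance percentage in the alerts can be read as one minus the normalized Shannon entropy of the class counts, which appears to be how ydata-profiling scores it. A minimal sketch under that assumption; the function name and the example counts are hypothetical and are not taken from this report.

```python
import math


def imbalance_score(counts):
    """1 - H(p) / log2(k): 0 for a uniform split, near 1 when one class dominates."""
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(k)


# Hypothetical binary target with 0.5% positives (not the real is_fraud counts).
example = imbalance_score([995, 5])
print(f"{example:.1%}")
```

A positive rate of roughly half a percent already produces a score in the mid-nineties, which is the regime the is_fraud alert reports.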

Reproduction

Analysis started: 2025-04-28 08:25:33.185706
Analysis finished: 2025-04-28 08:28:23.420114
Duration: 2 minutes and 50.23 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json

Variables

Distinct: 1819551
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB
Minimum: 2019-01-01 00:00:18
Maximum: 2020-12-31 23:59:34
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

cc_num
Real number (ℝ)

Distinct: 999
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.1738604 × 10^17
Minimum: 6.0416207 × 10^10
Maximum: 4.9923464 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 6.0416207 × 10^10
5-th percentile: 6.3048488 × 10^11
Q1: 1.8004295 × 10^14
median: 3.5214173 × 10^15
Q3: 4.6422555 × 10^15
95-th percentile: 4.497914 × 10^18
Maximum: 4.9923464 × 10^18
Range: 4.9923463 × 10^18
Interquartile range (IQR): 4.4622125 × 10^15

Descriptive statistics

Standard deviation: 1.3091153 × 10^18
Coefficient of variation (CV): 3.1364616
Kurtosis: 6.1753558
Mean: 4.1738604 × 10^17
Median Absolute Deviation (MAD): 3.0764709 × 10^15
Skewness: 2.8510736
Sum: 5.0088429 × 10^18
Variance: 1.7137828 × 10^36
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count    Frequency (%)
3.02704321 × 10^13    4392     0.2%
6.538441737 × 10^15   4392     0.2%
4.642255475 × 10^15   4386     0.2%
6.538891243 × 10^15   4386     0.2%
4.364010865 × 10^15   4386     0.2%
6.011438889 × 10^15   4385     0.2%
3.447098678 × 10^14   4385     0.2%
4.512828415 × 10^18   4384     0.2%
4.586810169 × 10^15   4384     0.2%
4.745996322 × 10^12   4384     0.2%
Other values (989)    1808530  97.6%

Minimum 10 values

Value                 Count    Frequency (%)
6.041620718 × 10^10   2196     0.1%
6.042292873 × 10^10   2200     0.1%
6.042309813 × 10^10   738      < 0.1%
6.042785159 × 10^10   743      < 0.1%
6.048700208 × 10^10   735      < 0.1%
6.04905963 × 10^10    1465     0.1%
6.049559311 × 10^10   742      < 0.1%
5.018029536 × 10^11   2194     0.1%
5.018181333 × 10^11   8        < 0.1%
5.018282048 × 10^11   733      < 0.1%

Maximum 10 values

Value                 Count    Frequency (%)
4.992346398 × 10^18   2922     0.2%
4.989847571 × 10^18   1471     0.1%
4.980323468 × 10^18   736      < 0.1%
4.973530368 × 10^18   1467     0.1%
4.958589672 × 10^18   2191     0.1%
4.95682899 × 10^18    3657     0.2%
4.911818931 × 10^18   9        < 0.1%
4.906628656 × 10^18   3655     0.2%
4.897067971 × 10^18   1471     0.1%
4.890424427 × 10^18   2189     0.1%

Distinct: 693
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 43
Median length: 36
Mean length: 23.130553
Min length: 13

Characters and Unicode

Total characters: 42846898
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: fraud_Rippin, Kub and Mann
2nd row: fraud_Heller, Gutmann and Zieme
3rd row: fraud_Lind-Buckridge
4th row: fraud_Kutch, Hermiston and Farrell
5th row: fraud_Keeling-Crist

Value               Count    Frequency (%)
and                 677362   15.7%
llc                 139662   3.2%
inc                 131148   3.0%
sons                104651   2.4%
ltd                 100896   2.3%
plc                 94799    2.2%
group               72089    1.7%
fraud_kutch         15028    0.3%
fraud_schaefer      13367    0.3%
fraud_streich       13235    0.3%
Other values (804)  2956186  68.5%

Most occurring characters

Value              Count     Frequency (%)
a                  4158232   9.7%
r                  3851348   9.0%
d                  3055994   7.1%
e                  2665745   6.2%
u                  2654462   6.2%
n                  2526397   5.9%
(space)            2466029   5.8%
f                  1996096   4.7%
_                  1852394   4.3%
o                  1614017   3.8%
Other values (45)  16006184  37.4%

Most occurring categories, scripts and blocks

Grouping  Value      Count     Frequency (%)
Category  (unknown)  42846898  100.0%
Script    (unknown)  42846898  100.0%
Block     (unknown)  42846898  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

category
Categorical

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB
gas_transport: 188029
grocery_pos: 176191
home: 175460
shopping_pos: 166463
kids_pets: 161727
Other values (9): 984524

Length

Max length: 14
Median length: 12
Mean length: 10.525913
Min length: 4

Characters and Unicode

Total characters: 19498139
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: misc_net
2nd row: grocery_pos
3rd row: entertainment
4th row: gas_transport
5th row: misc_pos

Common Values

Value             Count   Frequency (%)
gas_transport     188029  10.2%
grocery_pos       176191  9.5%
home              175460  9.5%
shopping_pos      166463  9.0%
kids_pets         161727  8.7%
shopping_net      139322  7.5%
entertainment     134118  7.2%
food_dining       130729  7.1%
personal_care     130085  7.0%
health_fitness    122553  6.6%
Other values (4)  327717  17.7%

Length

Histogram of lengths of the category

Most occurring characters

Value              Count    Frequency (%)
s                  2042254  10.5%
e                  1838696  9.4%
o                  1758769  9.0%
n                  1705118  8.7%
p                  1548294  7.9%
t                  1538055  7.9%
_                  1484860  7.6%
r                  1310440  6.7%
i                  1190524  6.1%
a                  950855   4.9%
Other values (10)  4130274  21.2%

Most occurring categories, scripts and blocks

Grouping  Value      Count     Frequency (%)
Category  (unknown)  19498139  100.0%
Script    (unknown)  19498139  100.0%
Block     (unknown)  19498139  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

amt
Real number (ℝ)

Skewed 

Distinct: 60616
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.063567
Minimum: 1
Maximum: 28948.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2.44
Q1: 9.64
median: 47.45
Q3: 83.1
95-th percentile: 195.34
Maximum: 28948.9
Range: 28947.9
Interquartile range (IQR): 73.46

Descriptive statistics

Standard deviation: 159.25397
Coefficient of variation (CV): 2.2729927
Kurtosis: 4181.9073
Mean: 70.063567
Median Absolute Deviation (MAD): 37.46
Skewness: 40.812809
Sum: 1.2978533 × 10^8
Variance: 25361.828
Monotonicity: Not monotonic
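
Several of the amt statistics are derived from others in the same block, so they can be cross-checked by hand. A minimal sketch of that arithmetic, using only values copied from this report; the variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Values copied from the amt statistics in this report.
n_rows = 1852394
mean, std = 70.063567, 159.25397
minimum, q1, q3, maximum = 1.0, 9.64, 83.1, 28948.9

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance
total = mean * n_rows            # Sum

print(f"range={value_range:.1f} iqr={iqr:.2f} cv={cv:.4f}")
print(f"variance={variance:.2f} sum={total:.4e}")
```

Each derived value agrees with the reported figure to within rounding of the inputs.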
Histogram with fixed size bins (bins=50)

Common values

Value                 Count    Frequency (%)
1.14                  779      < 0.1%
1.1                   745      < 0.1%
1.04                  744      < 0.1%
1.08                  741      < 0.1%
1.25                  737      < 0.1%
1.2                   737      < 0.1%
1.02                  736      < 0.1%
1.01                  735      < 0.1%
1.22                  727      < 0.1%
1.03                  726      < 0.1%
Other values (60606)  1844987  99.6%

Minimum 10 values

Value  Count  Frequency (%)
1      332    < 0.1%
1.01   735    < 0.1%
1.02   736    < 0.1%
1.03   726    < 0.1%
1.04   744    < 0.1%
1.05   721    < 0.1%
1.06   671    < 0.1%
1.07   723    < 0.1%
1.08   741    < 0.1%
1.09   720    < 0.1%

Maximum 10 values

Value     Count  Frequency (%)
28948.9   1      < 0.1%
27390.12  1      < 0.1%
27119.77  1      < 0.1%
26544.12  1      < 0.1%
25086.94  1      < 0.1%
22768.11  1      < 0.1%
21437.71  1      < 0.1%
19364.91  1      < 0.1%
17897.24  1      < 0.1%
16837.08  1      < 0.1%

first
Text

Distinct: 355
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 11
Median length: 9
Mean length: 6.0802977
Min length: 3

Characters and Unicode

Total characters: 11263107
Distinct characters: 49
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jennifer
2nd row: Stephanie
3rd row: Edward
4th row: Jeremy
5th row: Tyler

Value               Count    Frequency (%)
christopher         38112    2.1%
robert              30743    1.7%
jessica             29236    1.6%
david               28564    1.5%
michael             28539    1.5%
james               28496    1.5%
jennifer            24181    1.3%
john                23445    1.3%
mary                23424    1.3%
william             23396    1.3%
Other values (345)  1574258  85.0%

Most occurring characters

Value              Count    Frequency (%)
a                  1438618  12.8%
e                  1230164  10.9%
i                  883628   7.8%
n                  877668   7.8%
r                  867952   7.7%
l                  554750   4.9%
h                  493347   4.4%
s                  463151   4.1%
t                  444904   4.0%
o                  384330   3.4%
Other values (39)  3624595  32.2%

Most occurring categories, scripts and blocks

Grouping  Value      Count     Frequency (%)
Category  (unknown)  11263107  100.0%
Script    (unknown)  11263107  100.0%
Block     (unknown)  11263107  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

last
Text

Distinct: 486
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 11
Median length: 10
Mean length: 6.1123751
Min length: 2

Characters and Unicode

Total characters: 11322527
Distinct characters: 48
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Banks
2nd row: Gill
3rd row: Sanchez
4th row: White
5th row: Garcia

Value               Count    Frequency (%)
smith               40940    2.2%
williams            33661    1.8%
davis               31434    1.7%
johnson             28590    1.5%
rodriguez           24879    1.3%
martinez            21246    1.1%
jones               19825    1.1%
lewis               18293    1.0%
miller              16821    0.9%
gonzalez            16809    0.9%
Other values (476)  1599896  86.4%

Most occurring characters

Value              Count    Frequency (%)
e                  1122673  9.9%
r                  941641   8.3%
a                  926704   8.2%
n                  869662   7.7%
o                  832319   7.4%
l                  698286   6.2%
s                  696904   6.2%
i                  622878   5.5%
t                  412730   3.6%
h                  327959   2.9%
Other values (38)  3870771  34.2%

Most occurring categories, scripts and blocks

Grouping  Value      Count     Frequency (%)
Category  (unknown)  11322527  100.0%
Script    (unknown)  11322527  100.0%
Block     (unknown)  11322527  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB
F: 1014749
M: 837645

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1852394
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: F
3rd row: M
4th row: M
5th row: M

Common Values

Value  Count    Frequency (%)
F      1014749  54.8%
M      837645   45.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count    Frequency (%)
f      1014749  54.8%
m      837645   45.2%

Most occurring characters

Value  Count    Frequency (%)
F      1014749  54.8%
M      837645   45.2%

Most occurring categories, scripts and blocks

Grouping  Value      Count    Frequency (%)
Category  (unknown)  1852394  100.0%
Script    (unknown)  1852394  100.0%
Block     (unknown)  1852394  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

street
Text

Distinct: 999
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 35
Median length: 29
Mean length: 22.231289
Min length: 12

Characters and Unicode

Total characters: 41181107
Distinct characters: 62
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 561 Perry Cove
2nd row: 43039 Riley Greens Suite 393
3rd row: 594 White Dale Suite 530
4th row: 9443 Cynthia Court Apt. 038
5th row: 408 Bradley Rest

Value                Count    Frequency (%)
apt                  468297   6.4%
suite                437016   5.9%
island               32903    0.4%
michael              27058    0.4%
islands              25611    0.3%
station              25602    0.3%
common               25585    0.3%
david                24853    0.3%
brooks               24143    0.3%
fields               23400    0.3%
Other values (1959)  6253340  84.9%

Most occurring characters

Value              Count     Frequency (%)
(space)            5515414   13.4%
e                  2561201   6.2%
a                  2077034   5.0%
i                  1851621   4.5%
t                  1782137   4.3%
r                  1576757   3.8%
n                  1523518   3.7%
s                  1476954   3.6%
l                  1270600   3.1%
o                  1251043   3.0%
Other values (52)  20294828  49.3%

Most occurring categories, scripts and blocks

Grouping  Value      Count     Frequency (%)
Category  (unknown)  41181107  100.0%
Script    (unknown)  41181107  100.0%
Block     (unknown)  41181107  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

city
Text

Distinct: 906
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 25
Median length: 21
Mean length: 8.6526209
Min length: 3

Characters and Unicode

Total characters: 16028063
Distinct characters: 52
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Moravian Falls
2nd row: Orient
3rd row: Malad City
4th row: Boulder
5th row: Doe Hill

Value               Count    Frequency (%)
city                30780    1.3%
west                27847    1.2%
saint               20483    0.9%
north               20472    0.9%
falls               18286    0.8%
new                 16857    0.7%
mount               16098    0.7%
lake                16089    0.7%
san                 14638    0.6%
springs             12414    0.5%
Other values (929)  2118136  91.6%

Most occurring characters

Value              Count    Frequency (%)
e                  1555978  9.7%
a                  1334959  8.3%
n                  1173952  7.3%
o                  1168590  7.3%
l                  1115539  7.0%
r                  1070587  6.7%
i                  1007053  6.3%
t                  855511   5.3%
s                  637587   4.0%
(space)            459706   2.9%
Other values (42)  5648601  35.2%

Most occurring categories, scripts and blocks

Grouping  Value      Count     Frequency (%)
Category  (unknown)  16028063  100.0%
Script    (unknown)  16028063  100.0%
Block     (unknown)  16028063  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

state
Text

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 3704788
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NC
2nd row: WA
3rd row: ID
4th row: MT
5th row: VA

Value              Count    Frequency (%)
tx                 135269   7.3%
ny                 119419   6.4%
pa                 114173   6.2%
ca                 80495    4.3%
oh                 66627    3.6%
mi                 65825    3.6%
il                 62212    3.4%
fl                 60775    3.3%
al                 58521    3.2%
mo                 54904    3.0%
Other values (41)  1034174  55.8%

Most occurring characters

Value              Count    Frequency (%)
A                  508580   13.7%
N                  406389   11.0%
M                  314756   8.5%
I                  260547   7.0%
T                  220136   5.9%
L                  211461   5.7%
O                  205755   5.6%
C                  201235   5.4%
Y                  188176   5.1%
X                  135269   3.7%
Other values (14)  1052484  28.4%

Most occurring categories, scripts and blocks

Grouping  Value      Count    Frequency (%)
Category  (unknown)  3704788  100.0%
Script    (unknown)  3704788  100.0%
Block     (unknown)  3704788  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

zip
Real number (ℝ)

High correlation 

Distinct: 985
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48813.258
Minimum: 1257
Maximum: 99921
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 1257
5-th percentile: 7208
Q1: 26237
median: 48174
Q3: 72042
95-th percentile: 94569
Maximum: 99921
Range: 98664
Interquartile range (IQR): 45805

Descriptive statistics

Standard deviation: 26881.846
Coefficient of variation (CV): 0.55070788
Kurtosis: -1.0960542
Mean: 48813.258
Median Absolute Deviation (MAD): 23068
Skewness: 0.078949647
Sum: 9.0421387 × 10^10
Variance: 7.2263364 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count    Frequency (%)
82514               5116     0.3%
73754               5116     0.3%
48088               5115     0.3%
34112               5108     0.3%
61454               4392     0.2%
16114               4392     0.2%
84540               4386     0.2%
89512               4386     0.2%
72476               4386     0.2%
33872               4385     0.2%
Other values (975)  1805612  97.5%

Minimum 10 values

Value  Count  Frequency (%)
1257   2923   0.2%
1330   1466   0.1%
1535   734    < 0.1%
1545   1468   0.1%
1612   738    < 0.1%
1843   3652   0.2%
1844   2919   0.2%
2180   738    < 0.1%
2630   2924   0.2%
2908   745    < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
99921  14     < 0.1%
99783  2203   0.1%
99747  12     < 0.1%
99746  734    < 0.1%
99323  3651   0.2%
99160  4362   0.2%
99116  15     < 0.1%
99113  1463   0.1%
99033  3646   0.2%
98836  740    < 0.1%

lat
Real number (ℝ)

High correlation 

Distinct: 983
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.539311
Minimum: 20.0271
Maximum: 66.6933
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 20.0271
5-th percentile: 29.8826
Q1: 34.6689
median: 39.3543
Q3: 41.9404
95-th percentile: 45.8433
Maximum: 66.6933
Range: 46.6662
Interquartile range (IQR): 7.2715

Descriptive statistics

Standard deviation: 5.0714704
Coefficient of variation (CV): 0.13159214
Kurtosis: 0.79107707
Mean: 38.539311
Median Absolute Deviation (MAD): 3.3597
Skewness: -0.19199899
Sum: 71389988
Variance: 25.719812
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count    Frequency (%)
43.0048             5116     0.3%
36.385              5116     0.3%
42.5164             5115     0.3%
26.1184             5108     0.3%
41.3851             4392     0.2%
40.6761             4392     0.2%
36.0244             4386     0.2%
38.9999             4386     0.2%
39.5483             4386     0.2%
34.2853             4385     0.2%
Other values (973)  1805612  97.5%

Minimum 10 values

Value    Count  Frequency (%)
20.0271  2186   0.1%
20.0827  1463   0.1%
24.6557  3655   0.2%
26.1184  5108   0.3%
26.3304  741    < 0.1%
26.3771  732    < 0.1%
26.4215  4362   0.2%
26.4722  3650   0.2%
26.529   2202   0.1%
26.6939  1467   0.1%

Maximum 10 values

Value    Count  Frequency (%)
66.6933  12     < 0.1%
65.6899  734    < 0.1%
64.7556  2203   0.1%
55.4732  14     < 0.1%
48.8878  4362   0.2%
48.8856  2909   0.2%
48.8328  2200   0.1%
48.6669  1469   0.1%
48.6031  4376   0.2%
48.4786  2916   0.2%

long
Real number (ℝ)

High correlation 

Distinct: 983
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -90.227832
Minimum: -165.6723
Maximum: -67.9503
Zeros: 0
Zeros (%): 0.0%
Negative: 1852394
Negative (%): 100.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: -165.6723
5-th percentile: -119.0825
Q1: -96.798
median: -87.4769
Q3: -80.158
95-th percentile: -73.5365
Maximum: -67.9503
Range: 97.722
Interquartile range (IQR): 16.64

Descriptive statistics

Standard deviation: 13.747895
Coefficient of variation (CV): -0.15236867
Kurtosis: 1.8375586
Mean: -90.227832
Median Absolute Deviation (MAD): 8.1527
Skewness: -1.1469188
Sum: -1.671375 × 10^8
Variance: 189.00461
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count    Frequency (%)
-108.8964           5116     0.3%
-98.0727            5116     0.3%
-82.9832            5115     0.3%
-81.7361            5108     0.3%
-91.0391            4392     0.2%
-80.1752            4392     0.2%
-82.7243            4391     0.2%
-119.7957           4386     0.2%
-109.615            4386     0.2%
-90.9288            4386     0.2%
Other values (973)  1805606  97.5%

Minimum 10 values

Value      Count  Frequency (%)
-165.6723  2203   0.1%
-156.292   734    < 0.1%
-155.488   1463   0.1%
-155.3697  2186   0.1%
-153.994   12     < 0.1%
-133.1171  14     < 0.1%
-124.4409  1467   0.1%
-124.2174  2195   0.1%
-124.1587  1465   0.1%
-124.1437  2198   0.1%

Maximum 10 values

Value     Count  Frequency (%)
-67.9503  2922   0.2%
-68.5565  1467   0.1%
-69.2675  743    < 0.1%
-69.4828  2931   0.2%
-69.9576  737    < 0.1%
-69.9656  4374   0.2%
-70.1031  9      < 0.1%
-70.239   1455   0.1%
-70.3001  2924   0.2%
-70.3457  2196   0.1%

city_pop
Real number (ℝ)

Distinct: 891
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 88643.675
Minimum: 23
Maximum: 2906700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.1 MiB

Quantile statistics

Minimum: 23
5-th percentile: 139
Q1: 741
median: 2443
Q3: 20328
95-th percentile: 525713
Maximum: 2906700
Range: 2906677
Interquartile range (IQR): 19587

Descriptive statistics

Standard deviation: 301487.62
Coefficient of variation (CV): 3.4011182
Kurtosis: 37.572846
Mean: 88643.675
Median Absolute Deviation (MAD): 2188
Skewness: 5.5908046
Sum: 1.6420301 × 10^11
Variance: 9.0894784 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count    Frequency (%)
606                 8049     0.4%
1595797             7312     0.4%
1312922             7297     0.4%
241                 6578     0.4%
1766                6556     0.4%
2906700             5865     0.3%
302                 5853     0.3%
198                 5850     0.3%
276002              5849     0.3%
1126                5841     0.3%
Other values (881)  1787344  96.5%

Minimum 10 values

Value  Count  Frequency (%)
23     2915   0.2%
37     1469   0.1%
43     2920   0.2%
46     4386   0.2%
47     734    < 0.1%
49     1472   0.1%
51     1470   0.1%
52     740    < 0.1%
53     3660   0.2%
60     1472   0.1%

Maximum 10 values

Value    Count  Frequency (%)
2906700  5865   0.3%
2504700  2929   0.2%
2383912  737    < 0.1%
1595797  7312   0.4%
1577385  3680   0.2%
1526206  5113   0.3%
1417793  8      < 0.1%
1382480  2913   0.2%
1312922  7297   0.4%
1263321  5141   0.3%

job
Text

Distinct: 497
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 59
Median length: 38
Mean length: 20.232398
Min length: 3

Characters and Unicode

Total characters: 37478372
Distinct characters: 53
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Psychologist, counselling
2nd row: Special educational needs teacher
3rd row: Nature conservation officer
4th row: Patent attorney
5th row: Dance movement psychotherapist

Value               Count    Frequency (%)
engineer            188048   4.6%
officer             158202   3.8%
manager             87837    2.1%
scientist           79740    1.9%
designer            74639    1.8%
surveyor            70288    1.7%
teacher             54865    1.3%
psychologist        46856    1.1%
research            42426    1.0%
editor              40958    1.0%
Other values (457)  3270295  79.5%

Most occurring characters

Value              Count     Frequency (%)
e                  4003951   10.7%
i                  3407729   9.1%
r                  3140909   8.4%
a                  2593110   6.9%
t                  2547852   6.8%
n                  2521475   6.7%
(space)            2261760   6.0%
o                  2133314   5.7%
s                  2064644   5.5%
c                  1890653   5.0%
Other values (43)  10912975  29.1%

Most occurring categories, scripts and blocks

Grouping  Value      Count     Frequency (%)
Category  (unknown)  37478372  100.0%
Script    (unknown)  37478372  100.0%
Block     (unknown)  37478372  100.0%

Most frequent character per category, script and block

Every character falls under (unknown), so each of these tables matches the Most occurring characters table.

dob
Date

Distinct: 984
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB
Minimum: 1924-10-30 00:00:00
Maximum: 2005-01-29 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

trans_num
Text

Unique 

Distinct: 1852394
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 MiB

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 59276608
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1852394
Unique (%): 100.0%

Sample

1st row: 0b242abb623afc578575680df30655b9
2nd row: 1f76529f8574734946361c461b024d99
3rd row: a1a22d70485983eac12b5b88dad1cf95
4th row: 6b849c168bdad6f867558c3793159a81
5th row: a41d7549acf90789359a9aa5346dcb46

Value                             Count    Frequency (%)
d71c95ab6b7356dd74389d41df429c87  1        < 0.1%
1765bb45b3aa3224b4cdcb6e7a96cee3  1        < 0.1%
0b242abb623afc578575680df30655b9  1        < 0.1%
1f76529f8574734946361c461b024d99  1        < 0.1%
a1a22d70485983eac12b5b88dad1cf95  1        < 0.1%
6b849c168bdad6f867558c3793159a81  1        < 0.1%
a41d7549acf90789359a9aa5346dcb46  1        < 0.1%
189a841a0a8ba03058526bcfe566aab5  1        < 0.1%
83ec1cc84142af6e2acf10c44949e720  1        < 0.1%
6d294ed2cc447d2c71c7171a3d54967c  1        < 0.1%
Other values (1852384)            1852384  > 99.9%

Most occurring characters

Value | Count | Frequency (%)
9 | 3708557 | 6.3%
4 | 3707696 | 6.3%
7 | 3707599 | 6.3%
2 | 3707045 | 6.3%
3 | 3706132 | 6.3%
1 | 3705118 | 6.3%
d | 3704966 | 6.3%
a | 3704452 | 6.2%
8 | 3704258 | 6.2%
c | 3703707 | 6.2%
Other values (6) | 22217078 | 37.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 59276608 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 59276608 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 59276608 | 100.0%

unix_time
Real number (ℝ)

Distinct 1819583
Distinct (%) 98.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1.3586742 × 10^9
Minimum 1.325376 × 10^9
Maximum 1.3885344 × 10^9
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 14.1 MiB

Quantile statistics

Minimum 1.325376 × 10^9
5-th percentile 1.3300982 × 10^9
Q1 1.3430168 × 10^9
median 1.3570893 × 10^9
Q3 1.3745815 × 10^9
95-th percentile 1.3867821 × 10^9
Maximum 1.3885344 × 10^9
Range 63158356
Interquartile range (IQR) 31564662

Descriptive statistics

Standard deviation 18195081
Coefficient of variation (CV) 0.013391791
Kurtosis -1.1995793
Mean 1.3586742 × 10^9
Median Absolute Deviation (MAD) 15789076
Skewness -0.019735681
Sum 2.5168 × 10^15
Variance 3.3106099 × 10^14
Monotonicity Increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1381001869 | 4 | < 0.1%
1370177227 | 4 | < 0.1%
1370050667 | 4 | < 0.1%
1386957227 | 4 | < 0.1%
1335110521 | 4 | < 0.1%
1387468942 | 4 | < 0.1%
1387312599 | 4 | < 0.1%
1385924621 | 3 | < 0.1%
1372438073 | 3 | < 0.1%
1354282266 | 3 | < 0.1%
Other values (1819573) | 1852357 | > 99.9%

Value | Count | Frequency (%)
1325376018 | 1 | < 0.1%
1325376044 | 1 | < 0.1%
1325376051 | 1 | < 0.1%
1325376076 | 1 | < 0.1%
1325376186 | 1 | < 0.1%
1325376248 | 1 | < 0.1%
1325376282 | 1 | < 0.1%
1325376308 | 1 | < 0.1%
1325376318 | 1 | < 0.1%
1325376361 | 1 | < 0.1%

Value | Count | Frequency (%)
1388534374 | 1 | < 0.1%
1388534364 | 1 | < 0.1%
1388534355 | 1 | < 0.1%
1388534349 | 1 | < 0.1%
1388534347 | 1 | < 0.1%
1388534314 | 1 | < 0.1%
1388534284 | 1 | < 0.1%
1388534276 | 1 | < 0.1%
1388534270 | 1 | < 0.1%
1388534238 | 1 | < 0.1%
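
The derived unix_time figures can be checked by hand from the extreme values listed above. The sketch below recomputes the range and the coefficient of variation, then reads the extremes as seconds since the Unix epoch in UTC. The UTC reading is an assumption; the report does not state a time zone. Under that reading the values fall in 2012 and 2013, while the trans_date_trans_time values shown in the sample rows run from 2019-01-01 to 2020-12-31, so the two columns appear to be offset from each other.

```python
from datetime import datetime, timezone

# Values copied from the unix_time tables in the report.
unix_min = 1325376018
unix_max = 1388534374
std_dev = 18195081          # reported standard deviation (rounded)
mean = 1.3586742e9          # reported mean (rounded)

value_range = unix_max - unix_min
cv = std_dev / mean         # coefficient of variation = std / mean

# Assumption: values are seconds since the Unix epoch, interpreted in UTC.
first = datetime.fromtimestamp(unix_min, tz=timezone.utc)
last = datetime.fromtimestamp(unix_max, tz=timezone.utc)

print(value_range)
print(cv)
print(first.isoformat())
print(last.isoformat())
```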

merch_lat
Real number (ℝ)

High correlation 

Distinct 1754157
Distinct (%) 94.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 38.538976
Minimum 19.027422
Maximum 67.510267
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 14.1 MiB

Quantile statistics

Minimum 19.027422
5-th percentile 29.753795
Q1 34.740122
median 39.3689
Q3 41.956263
95-th percentile 46.002013
Maximum 67.510267
Range 48.482845
Interquartile range (IQR) 7.2161407

Descriptive statistics

Standard deviation 5.1056039
Coefficient of variation (CV) 0.13247897
Kurtosis 0.77423362
Mean 38.538976
Median Absolute Deviation (MAD) 3.38992
Skewness -0.1880969
Sum 71389368
Variance 26.067191
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
40.16396 | 4 | < 0.1%
38.714096 | 4 | < 0.1%
42.889354 | 4 | < 0.1%
38.206985 | 4 | < 0.1%
37.669788 | 4 | < 0.1%
41.785897 | 4 | < 0.1%
41.301611 | 4 | < 0.1%
41.23339 | 4 | < 0.1%
43.747922 | 4 | < 0.1%
38.164527 | 4 | < 0.1%
Other values (1754147) | 1852354 | > 99.9%

Value | Count | Frequency (%)
19.027422 | 1 | < 0.1%
19.027785 | 1 | < 0.1%
19.027804 | 1 | < 0.1%
19.027849 | 1 | < 0.1%
19.029798 | 1 | < 0.1%
19.031242 | 1 | < 0.1%
19.032277 | 1 | < 0.1%
19.032689 | 1 | < 0.1%
19.033288 | 1 | < 0.1%
19.034282 | 1 | < 0.1%

Value | Count | Frequency (%)
67.510267 | 1 | < 0.1%
67.441518 | 1 | < 0.1%
67.397018 | 1 | < 0.1%
67.188111 | 1 | < 0.1%
67.064277 | 1 | < 0.1%
66.835174 | 1 | < 0.1%
66.682905 | 1 | < 0.1%
66.679297 | 1 | < 0.1%
66.674714 | 1 | < 0.1%
66.67355 | 1 | < 0.1%

merch_long
Real number (ℝ)

High correlation 

Distinct 1809753
Distinct (%) 97.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean -90.22794
Minimum -166.67157
Maximum -66.950902
Zeros 0
Zeros (%) 0.0%
Negative 1852394
Negative (%) 100.0%
Memory size 14.1 MiB

Quantile statistics

Minimum -166.67157
5-th percentile -119.30928
Q1 -96.89944
median -87.440694
Q3 -80.245108
95-th percentile -73.365169
Maximum -66.950902
Range 99.720673
Interquartile range (IQR) 16.654332

Descriptive statistics

Standard deviation 13.759692
Coefficient of variation (CV) -0.15249924
Kurtosis 1.8312584
Mean -90.22794
Median Absolute Deviation (MAD) 8.2235005
Skewness -1.143933
Sum -1.6713769 × 10^8
Variance 189.32913
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-87.830842 | 4 | < 0.1%
-96.511763 | 4 | < 0.1%
-80.940524 | 4 | < 0.1%
-81.219189 | 4 | < 0.1%
-90.85685 | 4 | < 0.1%
-74.618269 | 4 | < 0.1%
-87.116414 | 4 | < 0.1%
-81.036745 | 4 | < 0.1%
-81.995265 | 4 | < 0.1%
-74.433003 | 4 | < 0.1%
Other values (1809743) | 1852354 | > 99.9%

Value | Count | Frequency (%)
-166.671575 | 1 | < 0.1%
-166.671242 | 1 | < 0.1%
-166.670685 | 1 | < 0.1%
-166.670132 | 1 | < 0.1%
-166.670006 | 1 | < 0.1%
-166.66991 | 1 | < 0.1%
-166.669812 | 1 | < 0.1%
-166.669638 | 1 | < 0.1%
-166.666179 | 1 | < 0.1%
-166.664828 | 1 | < 0.1%

Value | Count | Frequency (%)
-66.950902 | 1 | < 0.1%
-66.952026 | 1 | < 0.1%
-66.952352 | 1 | < 0.1%
-66.955602 | 1 | < 0.1%
-66.955996 | 1 | < 0.1%
-66.95654 | 1 | < 0.1%
-66.957364 | 1 | < 0.1%
-66.958659 | 1 | < 0.1%
-66.958751 | 1 | < 0.1%
-66.959178 | 1 | < 0.1%

is_fraud
Categorical

Imbalance 

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 14.1 MiB
0 | 1842743
1 | 9651

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 1852394
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Common Values

Value | Count | Frequency (%)
0 | 1842743 | 99.5%
1 | 9651 | 0.5%
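
The overview alert describes is_fraud as highly imbalanced (95.3%). One way to arrive at a figure of that size from the two class counts above is one minus the normalized Shannon entropy of the class distribution. The sketch below assumes that definition; the function name `imbalance_score` is hypothetical, and the formula is an assumption about how the score is obtained rather than something the report states.

```python
import math


def imbalance_score(counts):
    """Return 1 minus the normalized Shannon entropy of the class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    Assumed definition, for illustration only.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts for is_fraud taken from the Common Values table.
counts = [1842743, 9651]
fraud_rate = counts[1] / sum(counts)
score = imbalance_score(counts)

print(f"fraud rate: {fraud_rate:.2%}")
print(f"imbalance:  {score:.1%}")
```

With roughly one fraudulent record in every 192, a model that always predicts 0 would already be right about 99.5% of the time, so plain accuracy is a poor yardstick for this target.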

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 1842743 | 99.5%
1 | 9651 | 0.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1852394 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1852394 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1852394 | 100.0%

Interactions

[Figure: grid of 81 pairwise interaction plots; images not preserved]

Correlations

variable | amt | category | cc_num | city_pop | gender | is_fraud | lat | long | merch_lat | merch_long | unix_time | zip
amt | 1.000 | 0.019 | -0.001 | -0.024 | 0.001 | 0.000 | 0.013 | -0.000 | 0.013 | -0.000 | -0.001 | 0.001
category | 0.019 | 1.000 | 0.008 | 0.014 | 0.054 | 0.067 | 0.010 | 0.009 | 0.011 | 0.009 | 0.001 | 0.011
cc_num | -0.001 | 0.008 | 1.000 | 0.049 | 0.052 | 0.003 | -0.003 | -0.013 | -0.003 | -0.013 | 0.001 | 0.013
city_pop | -0.024 | 0.014 | 0.049 | 1.000 | 0.090 | 0.002 | -0.264 | 0.087 | -0.263 | 0.086 | -0.003 | -0.040
gender | 0.001 | 0.054 | 0.052 | 0.090 | 1.000 | 0.006 | 0.101 | 0.091 | 0.103 | 0.083 | 0.000 | 0.116
is_fraud | 0.000 | 0.067 | 0.003 | 0.002 | 0.006 | 1.000 | 0.038 | 0.038 | 0.038 | 0.038 | 0.022 | 0.004
lat | 0.013 | 0.010 | -0.003 | -0.264 | 0.101 | 0.038 | 1.000 | 0.105 | 0.991 | 0.104 | 0.001 | -0.162
long | -0.000 | 0.009 | -0.013 | 0.087 | 0.091 | 0.038 | 0.105 | 1.000 | 0.105 | 0.998 | -0.001 | -0.959
merch_lat | 0.013 | 0.011 | -0.003 | -0.263 | 0.103 | 0.038 | 0.991 | 0.105 | 1.000 | 0.104 | 0.001 | -0.162
merch_long | -0.000 | 0.009 | -0.013 | 0.086 | 0.083 | 0.038 | 0.104 | 0.998 | 0.104 | 1.000 | -0.001 | -0.957
unix_time | -0.001 | 0.001 | 0.001 | -0.003 | 0.000 | 0.022 | 0.001 | -0.001 | 0.001 | -0.001 | 1.000 | 0.001
zip | 0.001 | 0.011 | 0.013 | -0.040 | 0.116 | 0.004 | -0.162 | -0.959 | -0.162 | -0.957 | 0.001 | 1.000
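
To relate the matrix above to the "High correlation" flags on the location variables, the sketch below lists the pairs among lat, long, merch_lat, merch_long and zip whose absolute coefficient reaches a cut-off. The coefficients are copied from the matrix; the 0.5 cut-off and the name `THRESHOLD` are assumptions made for illustration, since the report does not state which threshold triggers the flag.

```python
from itertools import combinations

# Coefficients copied from the correlation matrix (rounded to 3 decimals).
corr = {
    ("lat", "long"): 0.105,
    ("lat", "merch_lat"): 0.991,
    ("lat", "merch_long"): 0.104,
    ("lat", "zip"): -0.162,
    ("long", "merch_lat"): 0.105,
    ("long", "merch_long"): 0.998,
    ("long", "zip"): -0.959,
    ("merch_lat", "merch_long"): 0.104,
    ("merch_lat", "zip"): -0.162,
    ("merch_long", "zip"): -0.957,
}

THRESHOLD = 0.5  # assumed cut-off for "high" correlation

variables = ["lat", "long", "merch_lat", "merch_long", "zip"]
high = [
    (a, b, corr[(a, b)])
    for a, b in combinations(variables, 2)
    if abs(corr[(a, b)]) >= THRESHOLD
]

for a, b, r in high:
    print(f"{a} ~ {b}: {r:+.3f}")
```

The pairs this selects are lat with merch_lat, long with merch_long, and zip with both longitude columns, so a model would likely gain little from keeping all five as separate raw features.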

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(index) | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud
0 | 2019-01-01 00:00:18 | 2703186189652095 | fraud_Rippin, Kub and Mann | misc_net | 4.97 | Jennifer | Banks | F | 561 Perry Cove | Moravian Falls | NC | 28654 | 36.0788 | -81.1781 | 3495 | Psychologist, counselling | 1988-03-09 | 0b242abb623afc578575680df30655b9 | 1325376018 | 36.011293 | -82.048315 | 0
1 | 2019-01-01 00:00:44 | 630423337322 | fraud_Heller, Gutmann and Zieme | grocery_pos | 107.23 | Stephanie | Gill | F | 43039 Riley Greens Suite 393 | Orient | WA | 99160 | 48.8878 | -118.2105 | 149 | Special educational needs teacher | 1978-06-21 | 1f76529f8574734946361c461b024d99 | 1325376044 | 49.159047 | -118.186462 | 0
2 | 2019-01-01 00:00:51 | 38859492057661 | fraud_Lind-Buckridge | entertainment | 220.11 | Edward | Sanchez | M | 594 White Dale Suite 530 | Malad City | ID | 83252 | 42.1808 | -112.2620 | 4154 | Nature conservation officer | 1962-01-19 | a1a22d70485983eac12b5b88dad1cf95 | 1325376051 | 43.150704 | -112.154481 | 0
3 | 2019-01-01 00:01:16 | 3534093764340240 | fraud_Kutch, Hermiston and Farrell | gas_transport | 45.00 | Jeremy | White | M | 9443 Cynthia Court Apt. 038 | Boulder | MT | 59632 | 46.2306 | -112.1138 | 1939 | Patent attorney | 1967-01-12 | 6b849c168bdad6f867558c3793159a81 | 1325376076 | 47.034331 | -112.561071 | 0
4 | 2019-01-01 00:03:06 | 375534208663984 | fraud_Keeling-Crist | misc_pos | 41.96 | Tyler | Garcia | M | 408 Bradley Rest | Doe Hill | VA | 24433 | 38.4207 | -79.4629 | 99 | Dance movement psychotherapist | 1986-03-28 | a41d7549acf90789359a9aa5346dcb46 | 1325376186 | 38.674999 | -78.632459 | 0
5 | 2019-01-01 00:04:08 | 4767265376804500 | fraud_Stroman, Hudson and Erdman | gas_transport | 94.63 | Jennifer | Conner | F | 4655 David Island | Dublin | PA | 18917 | 40.3750 | -75.2045 | 2158 | Transport planner | 1961-06-19 | 189a841a0a8ba03058526bcfe566aab5 | 1325376248 | 40.653382 | -76.152667 | 0
6 | 2019-01-01 00:04:42 | 30074693890476 | fraud_Rowe-Vandervort | grocery_net | 44.54 | Kelsey | Richards | F | 889 Sarah Station Suite 624 | Holcomb | KS | 67851 | 37.9931 | -100.9893 | 2691 | Arboriculturist | 1993-08-16 | 83ec1cc84142af6e2acf10c44949e720 | 1325376282 | 37.162705 | -100.153370 | 0
7 | 2019-01-01 00:05:08 | 6011360759745864 | fraud_Corwin-Collins | gas_transport | 71.65 | Steven | Williams | M | 231 Flores Pass Suite 720 | Edinburg | VA | 22824 | 38.8432 | -78.6003 | 6018 | Designer, multimedia | 1947-08-21 | 6d294ed2cc447d2c71c7171a3d54967c | 1325376308 | 38.948089 | -78.540296 | 0
8 | 2019-01-01 00:05:18 | 4922710831011201 | fraud_Herzog Ltd | misc_pos | 4.27 | Heather | Chase | F | 6888 Hicks Stream Suite 954 | Manor | PA | 15665 | 40.3359 | -79.6607 | 1472 | Public affairs consultant | 1941-03-07 | fc28024ce480f8ef21a32d64c93a29f5 | 1325376318 | 40.351813 | -79.958146 | 0
9 | 2019-01-01 00:06:01 | 2720830304681674 | fraud_Schoen, Kuphal and Nitzsche | grocery_pos | 198.39 | Melissa | Aguilar | F | 21326 Taylor Squares Suite 708 | Clarksville | TN | 37040 | 36.5220 | -87.3490 | 151785 | Pathologist | 1974-03-28 | 3b9014ea8fb80bd65de0b1463b00b00e | 1325376361 | 37.179198 | -87.485381 | 0
(index) | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud
1852384 | 2020-12-31 23:57:18 | 30344654314976 | fraud_Larkin, Stracke and Greenfelder | entertainment | 46.71 | Christine | Johnson | F | 8011 Chapman Tunnel Apt. 568 | Blairsden-Graeagle | CA | 96103 | 39.8127 | -120.6405 | 1725 | Chartered legal executive (England and Wales) | 1967-05-27 | a7105564935ea3977dc61ff9ced3bf5e | 1388534238 | 38.963543 | -120.457121 | 0
1852385 | 2020-12-31 23:57:50 | 3524574586339330 | fraud_Heathcote, Yost and Kertzmann | shopping_net | 29.56 | Ashley | Cabrera | F | 94225 Smith Springs Apt. 617 | Vero Beach | FL | 32960 | 27.6330 | -80.4031 | 105638 | Librarian, public | 1986-05-07 | 9fc9f6f9be3182d519a61a119cf97199 | 1388534270 | 27.593881 | -80.855092 | 0
1852386 | 2020-12-31 23:57:56 | 341546199006537 | fraud_Schmidt-Larkin | home | 12.68 | Mark | Brown | M | 8580 Moore Cove | Wales | AK | 99783 | 64.7556 | -165.6723 | 145 | Administrator, education | 1939-11-09 | a8310343c189e4a5b6316050d2d6b014 | 1388534276 | 65.623593 | -165.186033 | 0
1852387 | 2020-12-31 23:58:04 | 501802953619 | fraud_Pouros, Walker and Spencer | kids_pets | 13.02 | Robert | Flores | M | 3277 Fields Meadows Apt. 790 | Greenview | CA | 96037 | 41.5403 | -122.9366 | 308 | Call centre manager | 1958-09-20 | bd7071fd5c9510a5594ee196368ac80e | 1388534284 | 41.973127 | -123.553032 | 0
1852388 | 2020-12-31 23:58:34 | 3523843138706408 | fraud_Prosacco, Kreiger and Kovacek | home | 17.00 | Grace | Williams | F | 28812 Charles Mill Apt. 628 | Plantersville | AL | 36758 | 32.6176 | -86.9475 | 1412 | Drilling engineer | 1970-11-20 | 6d04313bfe4b661b8ca2b6a499a320fe | 1388534314 | 32.164145 | -87.539669 | 0
1852389 | 2020-12-31 23:59:07 | 30560609640617 | fraud_Reilly and Sons | health_fitness | 43.77 | Michael | Olson | M | 558 Michael Estates | Luray | MO | 63453 | 40.4931 | -91.8912 | 519 | Town planner | 1966-02-13 | 9b1f753c79894c9f4b71f04581835ada | 1388534347 | 39.946837 | -91.333331 | 0
1852390 | 2020-12-31 23:59:09 | 3556613125071656 | fraud_Hoppe-Parisian | kids_pets | 111.84 | Jose | Vasquez | M | 572 Davis Mountains | Lake Jackson | TX | 77566 | 29.0393 | -95.4401 | 28739 | Futures trader | 1999-12-27 | 2090647dac2c89a1d86c514c427f5b91 | 1388534349 | 29.661049 | -96.186633 | 0
1852391 | 2020-12-31 23:59:15 | 6011724471098086 | fraud_Rau-Robel | kids_pets | 86.88 | Ann | Lawson | F | 144 Evans Islands Apt. 683 | Burbank | WA | 99323 | 46.1966 | -118.9017 | 3684 | Musician | 1981-11-29 | 6c5b7c8add471975aa0fec023b2e8408 | 1388534355 | 46.658340 | -119.715054 | 0
1852392 | 2020-12-31 23:59:24 | 4079773899158 | fraud_Breitenberg LLC | travel | 7.99 | Eric | Preston | M | 7020 Doyle Stream Apt. 951 | Mesa | ID | 83643 | 44.6255 | -116.4493 | 129 | Cartographer | 1965-12-15 | 14392d723bb7737606b2700ac791b7aa | 1388534364 | 44.470525 | -117.080888 | 0
1852393 | 2020-12-31 23:59:34 | 4170689372027579 | fraud_Dare-Marvin | entertainment | 38.13 | Samuel | Frey | M | 830 Myers Plaza Apt. 384 | Edmond | OK | 73034 | 35.6665 | -97.4798 | 116001 | Media buyer | 1993-05-10 | 1765bb45b3aa3224b4cdcb6e7a96cee3 | 1388534374 | 36.210097 | -97.036372 | 0